This vignette shows you how to collect game IDs for a specific team over a specific time period and aggregate the corresponding box score information.
Specifically, it will walk you through collecting information on all Phoenix Suns games that took place during December and January of the 2006-07 season.
In [1]:
import goldsberry
import pandas as pd
pd.set_option("display.max_columns", 50)  # Show up to 50 columns when displaying data frames
goldsberry.__version__
Out[1]:
In [2]:
gameids = goldsberry.GameIDs()
gameids2015 = pd.DataFrame(gameids.game_list())
gameids2015.head()
Out[2]:
Like the PlayerList() class, the GameIDs() class defaults to the current season. If we want the 2006-07 season, we need to identify and change the proper parameters. We can see the available parameters to set by printing the api_params attribute.
In [3]:
gameids.api_params
Out[3]:
From the output, we can see that we should set the Season value to 2006-07. Calling get_new_data() with the new value fetches fresh data, which we then save as a data frame in a new object.
In [4]:
gameids.get_new_data(Season='2006-07')
gameids2006 = pd.DataFrame(gameids.game_list())
gameids2006.head()
Out[4]:
A quick filter for team names that contain 'Suns' returns every Suns game from the 2006-07 season.
In [5]:
# .ix was removed in pandas 1.0; .loc accepts the same boolean mask
suns_logs = gameids2006.loc[gameids2006['TEAM_NAME'].str.contains('Suns')]
suns_logs.head()
Out[5]:
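To see what the filter is doing without calling the API, here is a minimal sketch on a toy data frame. The TEAM_NAME column comes from the real data; the GAME_ID column name and every value below are made up for illustration, and the `na=False` argument is an optional safeguard that the cell above does not use.

```python
import pandas as pd

# Hypothetical stand-in for gameids2006; the values are invented for illustration.
toy_logs = pd.DataFrame({
    "GAME_ID": ["0020600001", "0020600001", "0020600002", "0020600003"],
    "TEAM_NAME": ["Phoenix Suns", "Los Angeles Lakers", "Phoenix Suns", None],
})

# str.contains returns a boolean Series; na=False treats missing names as non-matches.
is_suns = toy_logs["TEAM_NAME"].str.contains("Suns", na=False)

# .loc with a boolean mask keeps only the rows where the mask is True.
toy_suns = toy_logs.loc[is_suns]
print(toy_suns)
```

Building the mask as its own variable makes it easy to inspect or combine with other conditions before indexing.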
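We can verify that all of the games are there by checking the shape of the data frame. An NBA team plays 82 regular-season games, so if the list covers only the regular season we should see 82 rows.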
In [6]:
suns_logs.shape
Out[6]:
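The shape is a (rows, columns) tuple, so the first element is the number of games. As a rough sketch of a slightly stricter check, we can also confirm that no game appears twice. The toy frame and the GAME_ID column name below are assumptions for illustration, not output from goldsberry.

```python
import pandas as pd

# Hypothetical stand-in for suns_logs; the IDs are placeholders.
toy_suns = pd.DataFrame({
    "GAME_ID": ["A1", "A2", "A3"],
    "TEAM_NAME": ["Phoenix Suns"] * 3,
})

# shape unpacks into the row count and the column count.
n_rows, n_cols = toy_suns.shape

# duplicated() flags repeated IDs; any() reports whether at least one exists.
has_duplicates = bool(toy_suns["GAME_ID"].duplicated().any())

print(n_rows, n_cols, has_duplicates)
```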